Text characteristics of English language university Web sites

Author

  • Mike Thelwall
Abstract

The nature of the contents of academic Web sites is of direct relevance to the new field of scientific Web intelligence, and for search engine and topic-specific crawler designers. We analyze word frequencies in national academic Webs using the Web sites of three English-speaking nations: Australia, New Zealand and the U.K. Strong regularities were found in page size and word frequency distributions, but with significant anomalies. At least 26% of pages contain no words. High frequency words include university names and acronyms, Internet terminology, and computing product names: not always words in common usage away from the Web. A minority of low frequency words are spelling mistakes, with other common types including non-words, proper names, foreign language terms or computer science variable names. Based upon these findings, recommendations for data cleansing and filtering are made, particularly for clustering applications.

Similar resources

English Teachers Professional Development Needs for Web Development Skills: Meeting the Challenges of Teaching English Language in the Information Age

Utilizing the resources of the web in educational practices has made instructional processes more efficient and interesting, and has made the learning process much easier and more attractive. With the web, English language teachers now have the option of engaging learners in online (web-based) instruction in addition to conventional classroom instruction or alternativel...

Web-Based Communication of Global Companies: Do Languages and Culture Matter?

This project studies the use of language and cultural adaptation of Web sites of forty-one Global Fortune 500 companies from the developed world and emerging markets. Companies were selected from the Big Five (France, Germany, Japan, UK, US) and BRIC nations (Brazil, Russia, India, and China). The study investigates the extent to which English is the language of the Web for global companies, an...

A Comparison of ESLE Web-based English Vocabulary Learning Application with Traditional Desktop English Vocabulary Learning Application: Exceptional learner parents’ point of view

The aim of this study was to compare the Exceptional Student Learning English (ESLE) web application with a traditional application and to evaluate the ESLE app, mainly from the perspective of exceptional students' parents. To this end, five parents of exceptional students, together with their children, were selected from among 30 parents in Isfahan, Isfahan province. Open-ended questionnaires were sen...

The Impact of Input Enrichment in Long Text vs. Short Texts on Grammatical Accuracy in Writing Among Elementary Language Learners

This study investigated the influence of teaching accurate grammar in writing via enriched long and short texts for elementary students at the Shokouhe_Farhang institute. The homogenized subjects were divided into two groups of 18 and 17 participants. A writing exam was used as a pretest to check the students’ knowledge of the English past tense. The control group received the...

The Effect of Flipped Language Teaching on EFL Learners’ Text Comprehension: Learners’ English Proficiency Level in Focus

The current pretest-posttest quasi-experimental study sought, firstly, to examine the effect of employing flipped language teaching techniques on EFL learners' text comprehension and, secondly, to explore whether there was any significant interaction between the flipped classroom approach and EFL learners’ proficiency level. To this end, 65 male and female EFL learners were conveniently selecte...

Journal title:
  • JASIST

Volume 56, Issue

Pages -

Publication date 2005